Abstract. We have analyzed the results of the Gaussian decomposition of the
Leiden/Dwingeloo Survey (LDS) of galactic neutral hydrogen for the presence of
Gaussians probably not directly related to galactic H I emission. It is
demonstrated that at least three classes of such components can be
distinguished. The narrowest Gaussians obtained during the decomposition mostly
represent stronger random noise peaks in the profiles and some still
uncorrected radio interference. Many of the slightly wider weak Gaussians are
caused by increased uncertainties near the profile edges, and with still
increasing width the baseline problems become dominant among the weak
components. Statistical criteria are given for separating the regions of the
parameter space most likely populated by the problematic components from those
where the Gaussians describe, with higher probability, the actual Milky Way
H I emission.

The same analysis is also applied to the Leiden/Argentine/Bonn survey (LAB), a
compilation that combines a revised version of the LDS (LDS2) with the stray
radiation corrected version of the Southern sky survey of the Instituto
Argentino de Radioastronomia (IAR). It is demonstrated that the selection
criteria for dividing the parameter space are to a great extent independent of
the particular survey in use. The situation is more obscure for very wide
components. In this region the distributions of the components of different
origin seem to be more blended, and it is harder to decide on the basis of
Gaussian parameters alone whether the corresponding components are caused by
high velocity dispersion halo gas in the Milky Way, by external galaxies, or by
baseline problems, for example. Nevertheless, the presence of baseline problems
in the LDS is most likely indicated by the peculiarities of the distribution of
the widest Gaussians in the sky. A similar plot for the northern part of the
LAB demonstrates considerably lower numbers of spurious components, but there
are still problems with the southern part of the LAB. The strange
characteristics of the observational noise in the southern part of the LAB are
also pointed out.
1. INTRODUCTION

The Gaussian analysis of the observed H I profiles is a somewhat controversial
process. On the one hand, if we neglect the saturation and absorption effects,
suppose that most of the line shape comes from global rotation characteristics
of the Galaxy and that the galactic H I consists of separate hydrogen clouds with
equilibrium random velocity distribution, the observed profile can be considered as
a sum of Gaussian cloud components, shifted relative to each other by differential
rotation.

Unfortunately, the actual situation is much more complicated. We must consider
the possibility of intrinsically non-Gaussian contributions to the emission
(due to groups of atoms with asymmetrical or otherwise pronouncedly
non-Gaussian velocity distributions, or due to saturation in optically thick
regions and self-absorption by very cold foreground gas). Except for the
simplest profiles, the least squares Gaussian analysis is not unique: often
several quite different solutions fit the observations almost equally well, and
the method of least squares provides no satisfactory means for choosing between
them, while other, equally good or even better ones may not be found at all.
Strictly speaking, the pure method of least squares is not even valid when
applied to this problem, as neither the form of the components nor their number
is known, nor can it be assumed that the residuals are randomly distributed.
The solution is often determined partly by the number of components introduced
and the initial estimates of their parameters, and only partly by the observed
profile. All this makes a rigorous Gaussian analysis somewhat illusory.

These weaknesses of the Gaussian analysis were understood rather early (Kaper
et al. 1966; Takakubo & van Woerden 1966). Nevertheless, the method has
continued to be used up to the present time (e.g. Cappa de Nicolau & Pöppel
1986; Pöppel et al. 1994; Verschuur & Peratt 1999; Verschuur 2004). This
indicates that besides its weaknesses the method must also have some benefits
(some aspects are briefly discussed in Sec. 2.2 and in Haud 2000, hereafter
Paper I). Proceeding from this, we have created a new, fully automatic,
specialized computer program for the decomposition of large 21-cm H I line
surveys into Gaussian components. This program has been described in Paper I;
it represents the profiles as formal sums of only positive Gaussian functions
without considering the actual line formation processes. During the
decomposition, special attention is paid to the following features:

1. Several quite different solutions may often fit the observations almost equally 
well. To choose from these solutions, it was supposed that general properties 
of the hydrogen distribution are somewhat correlated at neighboring sky 
positions and therefore the program tries to find similar decompositions for 
corresponding profiles. 

2. With the increasing complexity of the observed profiles, the number of Gaus- 
sians in decompositions usually grows rapidly and the values of their param- 
eters become mutually dependent. To reduce this problem, special means 
have been used to keep the number of Gaussians as small as possible. 

In this paper we describe the first results from the application of the new
decomposition program to the Leiden/Dwingeloo Survey of galactic neutral
hydrogen by Hartmann & Burton (1997, hereafter the LDS) and to the recent
Leiden/Argentine/Bonn compilation of galactic H I results by Kalberla et al.
(2005, hereafter the LAB). We start with the description of the data used, the
results obtained and possible approaches to the interpretation of the
decomposition results (Section 2) and then proceed with the analysis of the
results for the LDS (Section 3) and the LAB (Section 4). In this analysis, the
main attention is paid to separating the features most likely corresponding to
the real emission of galactic H I from those representing different problems
during the observation, reduction and decomposition of the surveys. Such a
separation may be important in at least two respects. On the one hand, it may
help us to identify the problems with the observational data; on the other
hand, it may give us a cleaner sample of Gaussians for studying the properties
of the Milky Way H I. At the same time, it is important to understand that the
discrimination between different Gaussians, as described in this paper, is
statistical in nature, and as the distributions of different types of
Gaussians partially overlap, it cannot be used as the only and final criterion
for every particular Gaussian or profile.

In the present paper, we focus on the minority of the Gaussians that most
likely describe different types of problems; such problems could be searched
for using the components recognized as suspicious. In further papers we will
continue the study of the majority of the Gaussians, which with higher
probability measure spectral signatures of greater astronomical interest.

2. INTERPRETATION OF THE DECOMPOSITION RESULTS 



2.1. The data 

As test data for our decomposition program we used the original observed
profiles of the LDS (Hartmann 1994), reduced to T_b by P. M. W. Kalberla at
Bonn University (the reduction procedures are described in Hartmann 1994 and
Hartmann et al. 1996). These are not exactly the same as those published by
Hartmann & Burton (1997) on a CD-ROM. We have used the profiles before
averaging the repeated observations at identical sky positions and before
re-gridding them onto a common lattice. This choice was made because averaging
and re-gridding smear the differences between neighboring profiles and may
have an undesirable influence on the Gaussian decomposition process (see the
Introduction of Paper I). In this original form the survey contained 184 698
profiles, which after decomposition were represented by 1 493 187 Gaussians.
For a brief comparison, the 206 671 profiles of the published version of the
LDS were also decomposed into 1 644 665 Gaussians.

Recently a similar Southern sky high sensitivity H I survey at δ < -25° was
published by Bajaja et al. (2005) (IAR). The IAR and the LDS with the revised
stray-radiation and baseline corrections (LDS2) make up the LAB. The
specifications for the LDS2 and the IAR closely match each other, but all the
data reduction and calibration procedures were carried out entirely
independently for the two surveys. Proceeding from this, the Gaussian
decomposition was also carried out separately for the LDS2 and the IAR. Once
again, the original data were used for both surveys. Because for the LDS2 the
repeated observations were no longer averaged, but the final profiles were
selected on the basis of the best agreement of their Gaussian decompositions
with those of the neighboring profiles, we used these preselected profiles for
the analysis. In the case of the IAR, we first used for the decomposition the
original 1008-channel data of all observed profiles, and for repeated
observations we applied the same selection criteria as used for the LDS2. This
procedure gave us 1 064 808 Gaussians for 138 830 profiles of the LDS2 and
444 573 Gaussians for 50 980 profiles of the IAR.



2.2. The usage of Gaussians 

In general, the results of these Gaussian decompositions may be interpreted 
from two completely different points of view: 

1. Gaussian parameters may be considered just as a compact means for repre- 
senting the observed data without providing any physical interpretation to 
these parameters, or 

2. we may want to derive from the parameters of the obtained Gaussians direct 
information regarding the structure of the interstellar medium. 

For the first purpose, even in the case of the most complicated profiles,
hundreds of channel values in the profile are replaced by some tens of Gaussian
components, while physically significant information, such as the mean
velocities and the H I content of the H I features, can still be extracted as
easily as from the full profile data (Shane 1971). If correctly performed, the
decomposition just discards observational noise, and in this case we must
consider all the obtained Gaussians to be as real as the original profile data
and the criteria used for performing the decomposition. Moreover, some
specific features in the observed profiles (not necessarily corresponding to
distinct gas clouds in physical space) are usually represented by a specific
set of Gaussians, which can be found in the overall data set more easily than
unparameterized spectral features. As noted by Verschuur & Peratt (1999), such
Gaussian analysis allows us to characterize general properties of the profiles
from region to region in the sky and to draw conclusions based upon
similarities and differences in profile shapes. Verschuur has used this
approach to study the relations between the different line-width regimes of
the H I in the local interstellar medium and the critical ionization
phenomenon (Verschuur & Schmelz 1989; Verschuur & Magnani 1994; Verschuur &
Peratt 1999; Verschuur 2004). This is also the way of interpretation used in
the present paper: we are looking for patterns in the distribution of
Gaussians that correspond to different problems in the obtained decomposition.
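As an illustration of how mean velocities and the H I content can be read
directly from Gaussian parameters, the following minimal sketch assumes
optically thin emission (so that the standard conversion N_HI = 1.823e18 times
the integral of T_b dV, in K km s^-1, applies) and represents each component as
a hypothetical (T_b0, V_c, FWHM) tuple. The function names are ours and are not
part of the decomposition program of Paper I.

```python
import math

# Standard optically thin conversion factor, cm^-2 per (K km/s).
C_HI = 1.823e18


def gaussian_area(t_b0, fwhm):
    """Integral of a Gaussian of height t_b0 (K) and width fwhm (km/s)."""
    return t_b0 * fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


def column_density(components):
    """Total N_HI (cm^-2) of (t_b0, v_c, fwhm) tuples, optically thin case."""
    return C_HI * sum(gaussian_area(t, w) for t, _, w in components)


def mean_velocity(components):
    """Area-weighted mean central velocity (km/s) of the components."""
    areas = [gaussian_area(t, w) for t, _, w in components]
    weighted = sum(a * v for a, (_, v, _) in zip(areas, components))
    return weighted / sum(areas)
```

The area of a Gaussian is about 1.0645 times the product of its height and
FWHM, so both quantities follow from three numbers per component instead of
hundreds of channel values.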

The second approach is much more complicated, as it is well known that
Gaussians may yield direct information regarding the structure of the
interstellar medium only for the simplest profiles, where at least some
Gaussians are well separated. This considerably reduces the usefulness of the
Gaussian decomposition in directions close to the galactic equator, where the
line of sight may contain hundreds of times more gas than at higher latitudes.
The situation could not be improved even by increasing the resolution of
observations, as the shape of the emission spectra does not change greatly with
angular resolution (Baker & Burton 1979). This is so because near the galactic
plane the more or less continuous distribution of H I extends to distances much
larger than at high latitudes, and the velocity crowding squashes a lot of
space into a few kilometers per second. Individual interstellar components
blend completely, and little structure is revealed by increasing the
resolution. Only when an emission feature has a very odd velocity or is
considerably brighter than its surroundings, so that it is not blended with
other emission, may the Gaussian analysis still provide some information on the
properties of the interstellar gas (limits on velocity dispersion and
temperature, for example). Due to these considerations, we first use for our
discussion only the decomposition results obtained for relatively high galactic
latitudes (|b| > 30°), but afterwards we also test how much, and in which
respects, the general properties of the decomposition results differ between
the regions near the galactic plane and those farther away.

However, even if we consider only relatively simple profiles at high galactic
latitudes, some of the well separated Gaussians obtained during the
decomposition may correspond to features in the profiles other than the real
emission from galactic H I. Therefore, while in an ideal case the Gaussian
analysis can be used to separate the useful signal from the observational
noise, in reality such a separation is never perfect. For example, some of the
Gaussians obtained may still reflect the inaccuracies in the termination
criteria of the decomposition process, when stronger noise peaks are still
fitted by Gaussians (actually, for our program these criteria were deliberately
chosen so as to prefer fitting some noise to losing a weak signal; see Paper
I), or suspicious features (radio interference, bad baselines and so on)
contained in the profiles themselves. Proceeding from this, the main focus of
this paper is on the question of whether it is somehow possible to separate
such Gaussians from other, physically better founded ones, and in this way to
further clean the decomposition results. As a working hypothesis, we expect
that the distributions of the parameter values of the Gaussians representing
different artifacts may be distinguishable from those of the Gaussians
corresponding to the real H I emission of the Milky Way, and that it may be
possible to isolate the regions of the parameter space dominated by components
of one or another origin.

3. LDS 



3.1. Narrow Gaussians 



For separation of different kinds of Gaussians, their distribution in the plane
of height and width seems to be most informative. The height of a Gaussian is
defined by the value of the central brightness temperature T_b0 from the
standard Gaussian formula

$$T_b = T_{b0} \exp\left[-\frac{(V - V_c)^2}{2\sigma_V^2}\right], \qquad (1)$$

where T_b is the brightness temperature and V is the velocity of the gas
relative to the Local Standard of Rest. V_c is the velocity corresponding to
the center of the Gaussian. We characterize the widths of the components by
their full width at the level of half maximum (FWHM), which is related to the
velocity dispersion σ_V by a simple scaling relation
$\mathrm{FWHM} = \sqrt{8 \ln 2}\,\sigma_V$.

Fig. 1. Frequency distribution of the parameter values in the (lg(T_b0),
lg(FWHM)) plane for all Gaussians corresponding to profiles at galactic
latitudes |b| > 30°. Isodensity lines are drawn in the scale of lg(N + 1) with
an interval of 0.25. The thick solid and dashed red lines represent the
selection criteria discussed in the text.

U. Haud, P. M. W. Kalberla
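For reference, Eq. (1) and the FWHM scaling relation can be written as a short
function. This is only a sketch with names chosen here for illustration; it
evaluates a single component parameterized by its height, central velocity and
FWHM.

```python
import math

# FWHM = sqrt(8 ln 2) * sigma_V, roughly 2.3548 * sigma_V.
FWHM_PER_SIGMA = math.sqrt(8.0 * math.log(2.0))


def gaussian_tb(v, t_b0, v_c, fwhm):
    """Brightness temperature of one Gaussian component at velocity v (Eq. 1)."""
    sigma_v = fwhm / FWHM_PER_SIGMA
    return t_b0 * math.exp(-((v - v_c) ** 2) / (2.0 * sigma_v ** 2))
```

By construction the function returns T_b0 at V = V_c and T_b0 / 2 at
V = V_c ± FWHM / 2.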

To avoid the complicated profiles near the galactic plane, we present in Fig. 1
the (lg(T_b0), lg(FWHM)) distribution of all Gaussians for the LDS profiles at
galactic latitudes |b| > 30°. From this figure we can see that most frequently
the Gaussians have heights of about 1 ≲ T_b0 ≲ 10 K and widths 3 ≲ FWHM ≲ 30
km s^-1. These parameters are in general agreement with the usual two-phase
models of the atomic interstellar medium, where one phase is cold with
temperatures of about 100 K (CNM) and the other is warm with temperatures of
several thousand degrees (WNM). Considering also the temperature variations
and the additional line broadening due to the macroscopic turbulent motions in
the interstellar structures (σ_V ~ 2 to 5 km s^-1 according to Burton 1992),
the widths of the corresponding H I emission lines are in the range of about
1 ≲ FWHM ≲ 10 km s^-1 for the CNM (Crovisier 1981) and 12 ≲ FWHM ≲ 40 km s^-1
for the WNM (Mebold 1972). There are, however, concentrations of Gaussians
also around lg(T_b0) ≈ -0.4, lg(FWHM) ≈ 0 and lg(T_b0) ≈ -0.9, lg(FWHM) ≈ 2.1.

The first of these concentrations extends from relatively weak Gaussians up to
intensities of several tens of kelvins, and the widths are mostly below the
limit corresponding to the kinetic temperature of the coldest H I observed
(Verschuur & Knapp 1971, 1972; Braun & Burton 2000). Therefore, it is likely
that these components are not directly related to the emission of the galactic
gas, but represent some artifacts of the profiles or of their Gaussian
decomposition. An inspection of the corresponding profiles confirms this
guess. In Fig. 2 two examples are given. The upper panel represents an H I
profile measured at l = 228.9°, b = 54.5° and decomposed into 17 Gaussians, of
which 13 have widths around 1 km s^-1. One of these narrow Gaussians is rather
strong, with a height of nearly 19 K, representing the central peak of typical
radio interference, and most of the other Gaussians fit the fading oscillations
on both sides of the central peak. There are also two narrow Gaussians (not
shown in Fig. 2) at higher velocities, which represent some stronger noise
peaks not directly related to the illustrated interference pattern. The lower
panel illustrates Gaussians fitting noise peaks. As the noise level at
velocities around 100 km s^-1 is higher than in other regions of the profile
(probably due to the corrected radio interference), this higher noise has
brought about a large number of spurious Gaussians.

Fig. 2. Two examples of profiles with very narrow Gaussians. The observed
profiles are plotted with the stepped green lines, individual Gaussian
components with the thin smooth cyan lines, the Gaussian representation of the
profiles with the thick smooth red lines and the residuals with small magenta
points. Due to the large number of Gaussians in the lower panel, only the
parameters of the very narrow components are given.

As the described very narrow Gaussians are separated in Fig. 1 from the
distribution of the wider ones by a clearly visible "valley" of relatively
underrepresented values of Gaussian parameters, it seems rather safe to draw
the separation line between the Gaussians describing the emission of the
galactic H I and those representing the noise and radio interference along the
bottom of this valley (line 1 in Fig. 1). To select the shape and placement of
this line, we first counted the Gaussians in (0.05 x 0.05) bins in lg(T_b0)
and lg(FWHM), found for every strip in the range -0.3 < lg(T_b0) < 1.2 the
value of lg(FWHM) corresponding to the bin with the smallest number of
Gaussians, and fitted a parabola through the obtained (lg(T_b0), lg(FWHM))
pairs. Finally, the parameters of the parabola were adjusted by demanding that
the sum of the numbers of Gaussians in the bins through which the parabola is
drawn should be minimal. To some approximation, the result



$$\lg(\mathrm{FWHM}) = 0.046\,\lg(T_{b0})^2 \;[\text{sign illegible}]\; 0.074\,\lg(T_{b0}) + 0.104, \qquad (2)$$



could be used as a selection criterion to separate the Gaussians representing
interference and noise peaks from those describing the properties of the
emission of the galactic H I.
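The first two steps of the valley-tracing procedure (binning, then a parabola
through the emptiest bin of every strip) can be sketched as follows. This is
not the code used for the paper: the function names and the default lg(FWHM)
window are assumptions made here, the window has to be chosen so that it
brackets the valley between the two populated regions, and the final
adjustment step (minimizing the summed counts along the parabola) is omitted.

```python
from collections import Counter


def valley_points(lg_t, lg_w, t_lo=-0.3, t_hi=1.2, w_lo=-0.3, w_hi=0.6, step=0.05):
    """For every lg(T_b0) strip return (strip centre, lg(FWHM) of emptiest bin)."""
    n_t = round((t_hi - t_lo) / step)
    n_w = round((w_hi - w_lo) / step)
    counts = Counter()
    for t, w in zip(lg_t, lg_w):
        i = int((t - t_lo) // step)
        j = int((w - w_lo) // step)
        if 0 <= i < n_t and 0 <= j < n_w:
            counts[(i, j)] += 1
    points = []
    for i in range(n_t):
        # First bin with the smallest count in this strip.
        j_min = min(range(n_w), key=lambda j: counts[(i, j)])
        points.append((t_lo + (i + 0.5) * step, w_lo + (j_min + 0.5) * step))
    return points


def _det3(q):
    return (q[0][0] * (q[1][1] * q[2][2] - q[1][2] * q[2][1])
            - q[0][1] * (q[1][0] * q[2][2] - q[1][2] * q[2][0])
            + q[0][2] * (q[1][0] * q[2][1] - q[1][1] * q[2][0]))


def fit_parabola(points):
    """Least-squares y = a*x^2 + b*x + c via the normal equations (Cramer's rule)."""
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    r = [sum(y * x ** k for x, y in points) for k in range(3)]
    m = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [r[2], r[1], r[0]]
    d = _det3(m)
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for i in range(3):
            mc[i][col] = rhs[i]
        coeffs.append(_det3(mc) / d)
    return tuple(coeffs)
```

Calling `fit_parabola(valley_points(lg_t, lg_w))` on the lists of lg(T_b0) and
lg(FWHM) values gives the three coefficients of a first-guess separation
curve of the same form as the selection criterion above.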

3.2. Weak Gaussians of intermediate widths 



Around lg(T_b0) ≈ 0 and lg(FWHM) ≈ 0 the valley in Fig. 1 turns towards the
wider Gaussians, broadens and becomes less deep. Therefore, it is interesting
to check whether here, too, the region of underrepresented values of Gaussian
parameters may help us to separate the components describing different
phenomena. The answer seems to be yes. It is known that the receiver bandpass
is never square. As a result, the extreme edges of the obtained spectra fall
off steeply to zero intensity, and after the bandpass removal the intensities
in these channels become unreliable due to the division by the reference.
Moreover, the usual methods of baseline fitting are poorly constrained in these
regions. To take this into account, during the decomposition we have not used
the data from the 64 channels with the most negative velocities and from the
128 channels with the highest positive velocities. However, the excess noise
and baseline problems at the edges of the reduced profiles are not confined to
sharply defined velocity ranges. In some profiles they can be detected in
wider regions, in other profiles in more limited regions. To study how this is
expressed in the "language" of Gaussians, we plot in Fig. 3 the distribution of
the Gaussians that have central velocities closer than 64 channels to the
accepted profile ends. We can see that these Gaussians populate the region
separated in Fig. 1 from the main body of the distribution by the valley
running towards the upper left-hand corner of the plot.

Fig. 3. The frequency distribution of the parameters of the Gaussians near the
profile edges. The notation is the same as in Fig. 1.
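The selection of the Gaussians plotted in Fig. 3 amounts to a simple test on
the central velocity. The sketch below is an illustration only: the limits of
the accepted velocity range and the channel width are passed in as parameters,
and the numbers in the usage note are assumed values, not taken from the
decomposition program.

```python
def near_profile_edge(v_c, v_min_used, v_max_used, channel_width, n_channels=64):
    """True if v_c lies within n_channels of either end of the accepted range.

    Components centred outside the accepted range also return True.
    """
    margin = n_channels * channel_width
    return (v_c - v_min_used) < margin or (v_max_used - v_c) < margin
```

For example, with an assumed channel width of 1.03 km s^-1 and an assumed
accepted range of -450 to 396 km s^-1, `near_profile_edge(350.0, -450.0,
396.0, 1.03)` is true, while a component centred at zero velocity is not
flagged.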

In Fig. 4 two examples are given. The upper panel gives the H I profile
measured at l = 211.5°, b = 17.0° and decomposed into 12 Gaussians, 4 of which
have central velocities higher than 330 km s^-1, the velocity limit
corresponding to the 192nd channel from the positive velocity edge of the
observed profile. We can see that at these extreme velocities the baseline has
been drawn too low, causing the resulting mean profile to rise above the zero
intensity level. In the Gaussian decomposition this behavior is expressed by
adding a rather weak (T_b0 ≈ 0.05 K) but broad component with FWHM ≈ 83
km s^-1. The profile also becomes wavy in this region. As a result, the
decomposition program adds two more weak Gaussians at velocities V_c ≈ 333
km s^-1 and 375 km s^-1. Finally, as the program has considered the increase
of the mean channel values at extreme velocities as a sign of a possible useful
signal, the actual noise level of the profile is underestimated. As the
decomposition program tries to reduce the level of residuals to a value near
the estimated noise level of the profile, this results in several very narrow
Gaussians, representing the strongest noise peaks spread all over the profile.

Fig. 4. Two examples of profiles with problems at the positive velocity edge.
The notation is the same as in Fig. 2.

Gaussian decomposition of the H I surveys

The narrow Gaussians in Fig. 4 could be detected and rejected from the obtained
Gaussian representation of the LDS using selection criterion 1 (Eq. 2)
discussed above. As demonstrated by Fig. 3, the wider Gaussians near the
profile edges in Fig. 4 fall to the left of the region of underrepresented
values of the Gaussian parameters in Fig. 1. Therefore, it seems plausible
that even after the valley in the distribution of the Gaussian parameters in
Fig. 1 turns up near lg(T_b0) ≈ 0 and lg(FWHM) ≈ 0, it may still be
interpreted as the division between Gaussians representing the actual emission
of the galactic H I and those describing problems of observation, reduction
and decomposition. On the basis of these considerations, we followed the
underpopulated region to even wider Gaussians and approximated the location of
the most sparsely populated parameter values with a parabolic curve (solid
line 2 in Figs. 1 and 3).

However, in Fig. 3 we can also see that in the region of the widest Gaussians a
considerable fraction of them fall to the right of the parabolic curve
determined from the location of the distribution minimum in Fig. 1. A closer
inspection of the situation indicates that the majority of these Gaussians have
central velocities outside the range of channels actually used in the
decomposition process. That means they are similar to the FWHM ≈ 83 km s^-1
Gaussian in Fig. 4, but with their centers not at V_c ≈ 340 km s^-1, but
shifted beyond the right border of the figure. These Gaussians arise in the
cases where the baseline has been drawn progressively lower towards the end of
the used velocity range, so that the resulting profile continues to rise higher
above the zero level when approaching the velocity limit of the profile, as
illustrated in the lower panel of Fig. 4. The parameters of such Gaussians are
poorly determined, as they are estimated from a relatively small number of
profile channels covering less than half of the full extent of the component.
These Gaussians could be easily recognized by demanding that all the accepted
components must have their central velocities inside the velocity range used
for decomposition.
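The requirement that accepted components be centred inside the velocity range
used for decomposition can be expressed as a one-line filter. The following is
a hedged sketch: components are represented as hypothetical (T_b0, V_c, FWHM)
tuples, and the range limits must be supplied by the user for the survey at
hand.

```python
def centre_inside_range(v_c, v_min_used, v_max_used):
    """True if the central velocity lies inside the range used for decomposition."""
    return v_min_used <= v_c <= v_max_used


def accepted_components(components, v_min_used, v_max_used):
    """Keep only the (t_b0, v_c, fwhm) tuples centred inside the used range."""
    return [g for g in components
            if centre_inside_range(g[1], v_min_used, v_max_used)]
```

A weak, broad component whose fitted center lies beyond the positive end of the
range is dropped by this filter regardless of its height and width.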

3.3. The broadest Gaussians 

We have determined the shape and location of curve 2 in Fig. 3 in the same way
as described for curve 1. However, while at lower values of FWHM curve 2 can be
rather easily determined from the data used for Fig. 1, this becomes
increasingly uncertain when moving to higher values of FWHM. Above lg(FWHM) ≈
1.35 the valley bifurcates and the curve may actually follow either of two
completely different passages: the one indicated in Fig. 1 with the solid line,
or the other one, indicated by the dashed line. The main difference between
these two possibilities is that in the first case the widest Gaussians obtained
during the decomposition are excluded from the region corresponding to the
components representing the actual emission of the galactic H I, and in the
second case they are included in this region. Therefore, it is important to
know what is actually represented by such Gaussians.

Some decades ago, Field et al. (1969) demonstrated that the H I could be
considered as a two-phase medium, where much of the gas is observed to be either
WNM with T ~ 10^4 K or CNM with T ~ 100 K (Kulkarni & Heiles 1987; Dickey &
Lockman 1990). These temperatures correspond to line-widths mostly below 21
km s^-1, and the corresponding Gaussians have lg(FWHM) ≲ 1.9 even if we allow
for realistic turbulent motions in the gas (Mebold et al. 1982; Kulkarni & Fich
1985). Kalberla et al. (1998) have argued in favor of the existence of some
neutral gas with velocity dispersion as high as 60 to 80 km s^-1 in the halo.
The Gaussians corresponding to such gas may have lg(FWHM) ≲ 2.3, a limit still
considerably below the highest values of lg(FWHM) in Fig. 1. Therefore, it
seems clear that the widest Gaussians in Fig. 1 cannot be interpreted as the
representation of the real H I emission, and this favors the solid line 2 as a
selection criterion.
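The line-widths quoted above follow from the thermal velocity dispersion
σ = sqrt(k T / m_H) and the relation FWHM = sqrt(8 ln 2) σ. The sketch below
checks these numbers; the quadrature sum used for the turbulent contribution
is an assumption of this illustration (independent Gaussian broadening), and
the function names are ours.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
M_H = 1.6735e-27     # mass of the hydrogen atom, kg (approximate)
FWHM_PER_SIGMA = math.sqrt(8.0 * math.log(2.0))


def thermal_sigma(t_kin):
    """Thermal velocity dispersion (km/s) of H I at kinetic temperature t_kin (K)."""
    return math.sqrt(K_B * t_kin / M_H) / 1000.0


def thermal_fwhm(t_kin):
    """Purely thermal FWHM (km/s)."""
    return FWHM_PER_SIGMA * thermal_sigma(t_kin)


def broadened_fwhm(t_kin, sigma_turb):
    """FWHM (km/s) with turbulent dispersion sigma_turb (km/s) added in quadrature."""
    return FWHM_PER_SIGMA * math.hypot(thermal_sigma(t_kin), sigma_turb)
```

With these definitions `thermal_fwhm(100.0)` is about 2.1 km s^-1 and
`thermal_fwhm(1.0e4)` about 21 km s^-1, while a velocity dispersion of 80
km s^-1 corresponds to lg(FWHM) of about 2.28, just below the limit of 2.3
mentioned for the halo gas.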
However, what is then represented by the widest Gaussians rejected by the
second selection criterion? To discuss this, we first turn to the distribution
of the weak Gaussians remaining to the left of the solid curve 2 in Fig. 1 in
the (central velocity, line-width) plane (Fig. 5). Here we can see that the
widest Gaussians are concentrated in their velocities mainly around the value
of V_c = 0 km s^-1, with the distribution extending to slightly more than 150
km s^-1 on both sides. Less prominent concentrations of wide Gaussians are at
higher velocities (approximately at -390 ≲ V_c ≲ -230 km s^-1 and 180 ≲ V_c ≲
280 km s^-1). There is a certain excess of wide Gaussians also near the
profile edges, but we have already discussed them above.

Fig. 5. The distribution of the weakest Gaussians (those remaining to the left
of the solid curve 2 in Fig. 1) in the (central velocity, line-width) plane.
The isodensity lines are drawn in the scale of lg(N + 1) with an interval of
0.25.





Fig. 6. The sky distribution of the widest weak Gaussians in galactic
coordinates. The color scale represents the width (in units of lg(FWHM)) of the
widest Gaussian obtained for a given sky position, and the gradation is chosen
to enhance the contrast of the quadrangular fields.

In Fig. 6 we present the sky distribution of the Gaussians of the concentration
near the zero velocity in Fig. 5 (velocities -152 ≲ V_c ≲ 160 km s^-1). We can
see that in surprisingly many places (best visible at l > 180° and b < -30°)
these components form quadrangular "clouds" in the sky (a similar pattern is
not visible if we use the Gaussians lying to the right of the solid curve 2 in
Fig. 1). The size of these clouds is often 5° x 5° or an integer multiple of
this. If we now recall that the LDS observations were made in 5° x 5° fields
and that the same fields were also involved in the bandpass removal (Hartmann
1994), it seems that at least a considerable number of the wide Gaussians must
be due to some observational or reduction problems specific to each 5° x 5°
field and cannot describe the actual properties of galactic H I. This is the
main justification for choosing the second selection criterion as indicated by
the solid line 2 in Fig. 1, corresponding to


$$\lg(T_{b0}) = 0.547\,\lg(\mathrm{FWHM})^2 - 1.492\,\lg(\mathrm{FWHM}) + 0.235. \qquad (3)$$
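Applied to a catalogue of components, Eq. (3) becomes a simple comparison. The
sketch below assumes that "to the left of curve 2" means a height lower than
the curve value at the given width, which is how we read the orientation of
Fig. 1; the function names are illustrative only, and, as stressed in the
Introduction, the result is a statistical flag rather than a verdict on any
individual Gaussian.

```python
import math


def curve2_lg_tb0(lg_fwhm):
    """lg(T_b0) on the solid curve 2 for a given lg(FWHM), from Eq. (3)."""
    return 0.547 * lg_fwhm ** 2 - 1.492 * lg_fwhm + 0.235


def is_below_curve2(t_b0, fwhm):
    """True if a component (K, km/s) is weaker than curve 2 at its width."""
    return math.log10(t_b0) < curve2_lg_tb0(math.log10(fwhm))
```

For instance, the weak broad component of Fig. 4 (T_b0 ≈ 0.05 K, FWHM ≈ 83
km s^-1) is flagged, whereas a 5 K component with FWHM = 10 km s^-1 is not.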



3.4. The high velocity dispersion halo gas

When constructing Fig. 6, we tried several versions of the color scale to find
the one that most enhances the contrast of the quadrangular structures. We
found that this pattern dominates among the Gaussians with widths lg(FWHM) >
1.8, a result in good agreement with Fig. 5, where we can see that the
concentration of very wide Gaussians around zero velocity extends down to
widths of about lg(FWHM) = 1.8. At the same time, if we draw a figure similar
to Fig. 6, but for wide Gaussians in the velocity intervals -390 < V_c < -230
km s^-1 and 180 < V_c < 280 km s^-1, we cannot see the quadrangular pattern of
Fig. 6, and the picture is dominated by two concentrations of points, clearly
coinciding in location and shape with the northern tip of the Magellanic Stream
and the high-velocity cloud (HVC) complex AC. Therefore, it seems that not all
wide Gaussians rejected by the selection criterion of Eq. 3 are due to
observational or reduction problems, but some of them correspond to real gas.

Moreover, at this level of discussion it remains unclear if and how Gaussians
represent the high velocity dispersion halo gas (HVDHG) discussed by Kalberla
et al. (1998). From their Fig. 1 we may estimate that the corresponding
Gaussians, if present in our decomposition, must have heights of the order of
T_b0 ≈ 0.05 K. In our Fig. 1 they lie to the left of the solid line 2 and would
be rejected by the brute force application of the second selection criterion.
We postpone the detailed discussion of this question to further papers, but we
point out here that we believe that our decomposition has detected such gas.
Most likely the corresponding Gaussians add somewhat to the concentration
around V_c = -30 km s^-1, lg(FWHM) = 2.2 in Fig. 5 and form part of the more
diffuse background in Fig. 6. Nevertheless, it is clear that most of the widest
Gaussians in Fig. 1 do not represent the properties of the real gas, but it is
harder to draw a clear-cut separation line between different types of Gaussians
in this region than at smaller widths. In such a separation process we cannot
rely only on the heights and widths of the Gaussians, but must also consider at
least their velocities.

Fig. 7. Examples of profiles with broad Gaussians. The cases represented in the
upper and lower panels are discussed in the text. The notation is the same as
in Fig. 2.

Actually it is as hard to classify corresponding Gaussians as to decide when 
the broad wings of the H I emission lines in the observed profiles represent the
HVDHG and when they are caused by some problems, most likely the badly
behaving baselines or stray-radiation corrections. We illustrate this in Fig. 7,
where the profile in the upper panel is selected from the light 5° × 5° field in Fig. 6
(the median width of the widest Gaussians of every point of the field is equal to
19.7 km s⁻¹) and the profile in the lower panel from the dark one (the median
equals 187 km s⁻¹). In this way we may expect that the profile in the lower
panel probably has its widest Gaussian due to baseline problems and in the upper 
panel the widest Gaussian is more likely caused by HVDHG. However, the widest 
Gaussians in both panels have nearly the same widths and intensities and also the 
general shapes of the profiles seem to be rather similar. 

From the observed profile in the upper panel of Fig. 7 it may seem that the
decomposition with 4 Gaussians (with parameters given in Table 1) may give a better
model with more realistic widths of Gaussians. A closer inspection does not confirm
this expectation. Before decomposition the noise level of the signal-free regions of
this profile was estimated to be equal to 0.0864 K. If we divide the profile into two
regions, where the total intensity of the obtained Gaussians is below 10% of this
noise level (signal-free region) and above 10% of the noise level (the region with
signal), we can compute in both regions the rms of residuals after subtracting the
obtained Gaussians from the observed profile. In the case of a three component fit
the corresponding numbers are 0.0865 K and 0.0864 K, respectively. We see that this
model identifies nearly equal noise levels in both regions of the profile - a result we
should expect (by applying corresponding weights we have taken into account the
dependence of the noise on signal strength, as described in Paper I). For a four
component fit we obtain 0.0871 K and 0.0790 K, clearly indicating that we have used
too many Gaussians and made the rms of the residuals in the region containing the
signal too low.

Table 1. A four-component fit.

  Vc (km s⁻¹)   FWHM (km s⁻¹)   Tb0 (K)
    -51.994         18.660        0.780
    -13.770         12.198        0.705
    -28.634         71.642        0.517
     83.636         36.210        0.087
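The residual test described above can be sketched in Python as follows. This is a
minimal illustration and not the decomposition program itself: the function names
are hypothetical, the profile is assumed to be available as plain lists of channel
velocities and brightness temperatures, and the signal-dependent weights mentioned
in the text (Paper I) are left out for brevity.

```python
import math

# FWHM = 2*sqrt(2 ln 2) * sigma for a Gaussian.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def model_intensity(v, components):
    """Sum of Gaussians at velocity v; components are (Vc, FWHM, Tb0) tuples."""
    total = 0.0
    for vc, fwhm, tb0 in components:
        sigma = fwhm * FWHM_TO_SIGMA
        total += tb0 * math.exp(-0.5 * ((v - vc) / sigma) ** 2)
    return total

def residual_rms_by_region(velocities, profile, components, noise, fraction=0.10):
    """Return (rms in the signal-free region, rms in the region with signal).

    A channel counts as signal-free when the model intensity there is below
    `fraction` of the noise level (10% in the text). Unweighted sketch.
    """
    free, signal = [], []
    for v, t in zip(velocities, profile):
        m = model_intensity(v, components)
        (signal if m > fraction * noise else free).append(t - m)

    def rms(values):
        if not values:
            return float("nan")
        return math.sqrt(sum(x * x for x in values) / len(values))

    return rms(free), rms(signal)

# Parameters of the four-component fit from Table 1: (Vc, FWHM, Tb0).
TABLE1 = [(-51.994, 18.660, 0.780), (-13.770, 12.198, 0.705),
          (-28.634, 71.642, 0.517), (83.636, 36.210, 0.087)]
```

Applied to an observed profile, a markedly lower rms in the region with signal than
in the signal-free region would point to over-fitting, as in the four component case
discussed above.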

Therefore, on the basis of the present discussion, we cannot determine
unambiguously the reason for the broad Gaussians. Some of them may correspond to
real H I emission, others to artifacts due to the problems in baseline
determination. However, even for wide Gaussians Figs. 1, 5 and 6 may help us separate
the regions of the parameter space, where such problems exist, from those, where
most components with high probability describe the real emission.

3.5. Low galactic latitudes and the Atlas 



So far we have discussed mainly the data for the regions at |b| > 30°, where the
hydrogen profiles are simpler than near the galactic plane. At the same time, we
have pointed out that the Gaussian analysis allows us to characterize general
properties of the profiles from region to region in the sky and to draw conclusions
based upon similarities and differences in profile shapes. Therefore, it is
interesting to check how different the (lg(Tb0), lg(FWHM)) distribution of Gaussians
at low galactic latitudes is in comparison to Fig. 1. The corresponding results are
presented in Fig. 8. We can see that the main difference between Figs. 1 and 8 is a
greater extent of the distribution towards the stronger and wider Gaussians in
Fig. 8. This means that at low galactic latitudes many Gaussians are higher and/or
wider than at high latitudes - the property caused by a complex superposition of low
optical density gas in and near the galactic plane, which in many cases does not
allow us to distinguish the contribution of every single concentration of H I.




Fig. 8. Frequency distribution of the parameters of all Gaussians corresponding to
profiles at galactic latitudes |b| < 30°.

Fig. 9. The same as Fig. 1, but for the published version of the LDS. The plot is
scaled to the same number of profiles as in Fig. 1.



However, despite the presence of large numbers of Gaussians, representing the
total emission of unknown numbers of actual gas concentrations with partly
unknown properties, Figs. 1 and 8 are in general rather similar. At least both of these
plots can be used to the same extent for the separation of the Gaussians, arising
most likely from the real H I emission, from those probably caused by different
observational and reductional problems and there is even no need to considerably
change the selection criteria described above. In both figures we can also see
similarities in the distribution of the Gaussians with realistic parameters: two main
concentrations in the width interval 3 < FWHM < 35 km s⁻¹. Therefore, it is
possible to obtain from the near plane data at least some results similar to those
obtainable at higher latitudes.

As we used the original profiles for the testing of the decomposition program, as
described above, it is also interesting to check how much the decomposition
results are affected by the averaging of the re-observed profiles and by re-gridding
of the whole survey. Therefore, in Fig. 9 we present as an example an analog of
Fig. 1 for the published version of the LDS. We can see that this version of the plot
is rather similar to that of Fig. 1. The most significant difference is that in the
decomposition of the LDS Atlas data the resulting Gaussians are somewhat weaker
than in the case of original profiles. This is best visible if we compare the locations
of the distribution maxima around lg(Tb0) ≈ 0.5 and lg(FWHM) ≈ 0.6 in Figs. 1
and 9, and it may be explained as a result of interpolation between profiles with
slightly differing locations of the line peaks - these peaks have been smoothed
down. The decomposition of the Atlas data also contains larger numbers of weak
Gaussians and the weakest components are weaker than in the case of the original
LDS. However, this is a rather natural result, as both re-gridding (interpolation)
and averaging of the observed profiles reduce the mean noise level of the results
and force the decomposition program to fit more weak Gaussians to the data.

The comparison of Figs. 1, 8 and 9 also illustrates the uncertainties of the
selection criterion 2 for the widest Gaussians. While for widths lg(FWHM) ≲ 1.35
the line selected on the basis of Fig. 1 is more or less acceptable also for other
cases, above this value for |b| < 30° the dashed line seems to be more acceptable
and for the Atlas data both versions of the line 2 are rather arbitrary.

4. LAB 



To cover the total sky, this compilation combines a revised version of the LDS with
the IAR survey. We discuss the properties of both surveys individually.

4.1. LDS2

When discussing the results for the original version of the LDS, we identified in
the (Tb0, FWHM) distribution three regions of the Gaussians, corresponding most
likely to different problems occurring during the observations and data reduction.
These Gaussians represented the noise, radio-interferences, increased uncertainties
near the profile edges and problems in baseline determination or in stray-radiation
corrections. For the LDS2, included into the LAB, the stray-radiation corrections and
the baselines were recalculated (Kalberla et al. 2005) and therefore it is interesting
to compare the decomposition results for the LDS and the LDS2. The (Tb0, FWHM)
distribution of the Gaussians obtained from the decomposition of the LDS2 is given
in Fig. 10. In this case, all the profiles of the survey are used for the plot and
therefore it corresponds to the sum of the distributions given in Figs. 1 and 8. From
the comparison of these distributions we can see that in the LDS2 the widest
Gaussians obtained for the LDS are missing, but there are no considerable changes in
other regions of problematic Gaussians. This may be considered as an indication
that the recalculation of the stray-radiation corrections and baselines has improved
the results.

Fig. 10. The same as Fig. 1, but for all profiles of the LDS2. The selection criteria
derived for the LDS are still indicated by solid and dashed orange lines, but those
used for the LAB are in solid red lines.




Fig. 11. The sky distribution of the widest weak Gaussians of the LDS2 in 
galactic coordinates. The color-scale represents the width (in units of lg(FWHM))
of the widest Gaussian obtained for a given sky position, and the gradation is 
chosen to enhance the contrast for the Southern sky. 

To further check the situation with the wide Gaussians, we present in Fig. 11 for 
the LAB the distribution of the widest Gaussians in the sky. This picture is based 
on the same selection criteria as used for Fig. 6. Only, to stress some Southern sky 
features, the center of the color-scale is downshifted to lg(FWHM) = 1.5, but this
has no considerable effect on the appearance of the Northern sky. When compared 
to Fig. 6, it is obvious that the checkered pattern, which was so conspicuous in 
the case of the LDS has nearly disappeared in the LDS2, and therefore we may 
conclude that the stray-radiation corrections and baseline determinations for the
LDS2 have been made at least much more homogeneously than for the LDS. 
Concerning Fig. 10, this also means that in the case of the upper part of the 
selection criterion 2 the solid line preferred for the LDS seems now rather obsolete 
and the dashed curve is a much more attractive choice. Therefore, at least on 
the basis of the distribution of the Gaussian parameters most of the weak wide 
components seem to correspond to some real population of galactic H I - most
likely the HVDHG, discussed by Kalberla et al. (1998).

4.2. IAR



When we started the decomposition of the IAR, the first results were rather
disappointing and surprising: on average, we obtained about 1.65 times more
Gaussians per profile than in the case of the LDS or the LDS2, and most of these
Gaussians were relatively narrow and weak. This is exactly the behavior that
may be expected from the decomposition program if the estimates of the noise
level of the profiles are too low. In this case the program tries to reduce the
rms of the residuals below the actual noise level of the survey and the only way
to achieve this is to add into the decomposition many weak narrow components,
which represent the strongest peaks of the observational noise. Therefore, we first
checked our determination of the noise level in signal-free regions of the profiles.
We used several different algorithms for estimating this noise level (they have been
described in detail in Paper I) and concluded that the results of different methods
agreed very well with each other and with the estimate for the final mean rms
noise of the database given by Bajaja et al. (2005). From this we concluded that
the problem cannot lie in the determination of the noise level of the signal-free
parts of the profiles.

Fig. 12. The dependence of the total uncertainties (green crosses) in channel values
and of the observational noise (blue pluses) on the signal strength. The thick solid
red line indicates extrapolation of the noise level in regions containing the line
emission into the signal-free region.

Another possibility was that the problems may hide in the usage of the radiometer
equation for the description of the dependence of the noise strength on signal
intensity. To get the first insight, we compared channel by channel the re-observed
profiles (the detailed description of this procedure is given in Section 2.1 of
Paper I). To reduce the role of uncertainties in the brightness temperature
calibration and other possible scaling problems, we first compared the average
channel values inside the usable velocity range of profiles at the same sky positions
and rejected all results for the positions where the dispersion of these averages for
different profiles was more than 0.05 K (selecting even a smaller limit did not
considerably change the results). Next we normalized all profiles at the same sky
position to the same average channel value and compared all possible pairs of
profiles at a given sky position channel by channel. In this comparison we used the
average of corresponding channel values from different profiles as an indication of
the signal level and their difference as an indication of the noise level. The results
for small signal strengths are given with crosses in Fig. 12. We can see a rather
unexpected behavior. Where the signal is missing, the results once again agree very
well with the mean noise estimates for the survey. However, already for the 0.5 K
signal the uncertainties in channel values have nearly doubled, and only after this
do the uncertainties grow more or less linearly with the signal, as expected from the
radiometer equation.
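The channel-by-channel comparison of two re-observed profiles can be illustrated
with the following Python sketch. It is a simplified example under stated
assumptions: the function name and the binning are hypothetical, the two profiles
are assumed to be already normalized to the same average channel value, and the
division of the difference by sqrt(2) (appropriate for two independent measurements
with equal noise) is an assumption of this example that is not specified in the text.

```python
import math

def uncertainty_vs_signal(profile_a, profile_b, bin_width=0.5):
    """Estimate channel uncertainties as a function of signal strength.

    For every channel the mean of the two profiles is used as the signal
    level and their difference as a measure of the uncertainty. Returns a
    dict mapping the lower edge of each signal bin (K) to the rms estimate.
    """
    sums = {}  # bin index -> [sum of squared scaled differences, count]
    for a, b in zip(profile_a, profile_b):
        signal = 0.5 * (a + b)
        # Assumption: equal, independent noise in both profiles, so the
        # difference has a dispersion sqrt(2) times that of one profile.
        delta = (a - b) / math.sqrt(2.0)
        key = math.floor(signal / bin_width)
        entry = sums.setdefault(key, [0.0, 0])
        entry[0] += delta * delta
        entry[1] += 1
    return {key * bin_width: math.sqrt(s / n) for key, (s, n) in sums.items()}
```

With more than two profiles at a sky position, the same function could be called
for every pair and the binned results combined.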

What is described above, is a rather direct estimation of the uncertainties in 
channel values at different levels of the signal. However these estimates contain 
not only observational noise, but they may also have been considerably increased 
by uncertainties in data reduction procedures (baseline, stray-radiation etc.). To 
study the pure noise, we must reject other sources of differences between the reob- 
served profiles. To some extent, this can be done by smoothing every profile and 
taking the smoothed version of the profile as an estimate of the signal behavior 
and the differences between original and smoothed versions as an estimate of the 
noise. For such smoothing we used the Savitzky-Golay filters of the second degree 
with different window widths. In Fig. 12 the plus signs show the results for (2,2,2) 
filter. Once again, the results for signal-free regions agree well with other
estimates, but in the region of signal strengths of about 0.02-0.5 K there is a rapid
increase in noise and only after this the noise behaves more or less as predicted by 
the radiometer equation. Of course, now the estimates of the noise strength in the 
regions containing a signal are considerably lower than those, obtained from the 
reobserved profiles, but these differences are in good agreement with the expec- 
tations by Bajaja et al. (2005) that "the necessary interpolation of the baseline 
leads to uncertainties which are enhanced by a factor of 2 to 3 over the typical 
rms uncertainties of 0.07 K as determined outside the regions with line emission". 
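A smoothing-based noise estimate of this kind can be sketched in Python. The
example below is an illustration only: it assumes that the (2,2,2) filter denotes
two points to the left, two points to the right and a polynomial of degree two,
i.e. the classical 5-point quadratic Savitzky-Golay filter with the coefficients
(-3, 12, 17, 12, -3)/35, and the function names are hypothetical.

```python
import math

# Classical 5-point quadratic Savitzky-Golay smoothing coefficients.
SG_5_QUADRATIC = [c / 35.0 for c in (-3.0, 12.0, 17.0, 12.0, -3.0)]

def smooth_residuals(profile, coeffs=SG_5_QUADRATIC):
    """Return (smoothed, residuals) for the interior channels of a profile.

    The smoothed profile serves as the estimate of the signal and the
    difference original - smoothed as the estimate of the noise. Channels
    closer than half a window to the profile edges are skipped.
    """
    half = len(coeffs) // 2
    smoothed, residuals = [], []
    for i in range(half, len(profile) - half):
        window = profile[i - half:i + half + 1]
        s = sum(c * t for c, t in zip(coeffs, window))
        smoothed.append(s)
        residuals.append(profile[i] - s)
    return smoothed, residuals

def rms(values):
    """Root mean square of a non-empty sequence."""
    return math.sqrt(sum(x * x for x in values) / len(values))
```

Such a filter also absorbs part of the noise, so the rms of these residuals is lower
than the channel noise itself; for uncorrelated noise and this particular filter the
ratio is sqrt(18/35), about 0.72. Whether and how such a factor was taken into
account is not stated in the text.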

On the basis of the described results, we decided to slightly modify the noise
estimates used in the decomposition program. We still obtain the main estimate of
the noise strength from the signal-free regions of the profiles, but for regions with
H I emission we allow for a 16% higher noise level. Such a model is indicated in
Fig. 12 by the solid line. The 16% increase is chosen as a conservative value from
the results with different smoothing filters. This estimate is not very precise, but
we believe that the actual value cannot be considerably smaller, and according to
some results it may be even somewhat larger (up to about 21%). We also rechecked the
LDS data for the presence of such a jump and found no need for introducing a similar
correction in this case. After including the described 16% correction into the
decomposition process of the IAR data, the results become much more similar to those
of the LDS2. In the case of the IAR, there are still on average 14% more Gaussians
per profile than for the LDS2, but this may be natural, as there is also about 27%
more H I per profile in the Southern sky than in the Northern sky.
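One possible reading of this modified noise model is sketched below in Python. The
form of the radiometer-equation dependence, the system temperature `t_sys` and the
threshold `emission_threshold` separating signal-free channels from channels with
emission are assumptions introduced for illustration; only the 16% increase is
taken from the text.

```python
def channel_noise(tb, sigma_free, t_sys, emission_threshold, excess=0.16):
    """Expected rms noise (K) in a channel with brightness temperature tb (K).

    sigma_free         -- noise estimated from the signal-free regions (K)
    t_sys              -- system temperature (K); hypothetical parameter
    emission_threshold -- tb above which a channel is treated as containing
                          H I emission (K); hypothetical parameter
    excess             -- fractional noise increase in regions with emission
    """
    if tb <= emission_threshold:
        return sigma_free
    # Radiometer equation: the noise scales with the total power, i.e. with
    # (t_sys + tb); the level in emission regions is raised by `excess`.
    return sigma_free * (1.0 + excess) * (t_sys + tb) / t_sys
```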




Fig. 13. The same as Fig. 10, but for the IAR. The plot is scaled to the same
number of profiles as in Fig. 10.

The (Tb0, FWHM) distribution of the Gaussians obtained from the decomposition of
the IAR is given in Fig. 13. When comparing with Fig. 10, we can see that now the
numbers of weak narrow Gaussians are in both cases rather similar, but for the IAR
there are more strong narrow Gaussians than in the case of the LDS2. Most likely
these are the interference induced profile components mentioned also by Bajaja
et al. (2005). Due to the larger number of Gaussians per profile in the Southern sky
the overall extent of the distribution in Fig. 13 is also wider than in Fig. 10, but
the general shapes of the isolines are rather similar, except in the region
-1.0 ≲ lg(Tb0) ≲ -0.5 and 1.5 ≲ lg(FWHM) ≲ 2.0, where the IAR seems to contain
considerably more Gaussians than the LDS2. This is visible also from Fig. 11, where
the Southern sky shows once again the quadrangular patterns discussed in connection
with the baseline problems of the LDS and mentioned also by Bajaja et al. (2005).
However, in this case the quadrangular pattern is visible in narrower Gaussians than
in the LDS, indicating that in the IAR at least some baseline defects are smaller
in their frequency extent (Fig. 14).

Fig. 14. An example of the IAR profile with bad baseline. The notation is the same
as in Fig. 2.



5. CONCLUSIONS 



In this paper we have mainly used the high galactic latitude part |b| > 30° of
the Leiden/Dwingeloo Survey of galactic neutral hydrogen by Hartmann & Burton
(1997) to demonstrate how the Gaussian analysis can be used for statistical
cleaning of the observed H I profiles from observational noise and different
observational and reductional artifacts. The removal of most of the observational
noise is achieved by the process of the Gaussian decomposition. The program searches
the profiles for regions where the measured brightness temperatures are above the
estimated noise level, and fits this excess with a Gaussian function of three free
parameters. The Gaussians are added until the rms of the residuals becomes close
to the initial noise level estimate of the signal-free regions of the profile (the
detailed description of the decomposition process is given by Haud 2000). At this
point the residuals are considered as pure observational noise and discarded from
further consideration.

We have demonstrated that not all Gaussians obtained in this way could be
considered as representing the real H I emission of the Milky Way. A considerable
part of the obtained Gaussians are still due to different observational, reductional
and decompositional problems, which have occurred during this process. However,
we have demonstrated that by analyzing the distribution of the parameters of the
obtained Gaussians, it is possible to further clean the results by distinguishing
the components, modeling different artifacts from those representing the actual
emission of the galactic hydrogen. Such a separation is easy for very narrow
Gaussians, which represent the radio-interferences not found and removed during the
reduction of the observed profiles, and the strongest random noise peaks
misinterpreted by the decomposition program as possible signal peaks. Disregarding
such Gaussians seems to be possible on the basis of their location in the line-width -
intensity distribution alone.

A considerable number of somewhat wider weak Gaussians seem to be caused
by the increased uncertainties in bandpass removal near the profile edges. In the 
direction of wider Gaussians this region smoothly transforms to the one domi- 
nated by Gaussians, most likely representing problems in the determination of the 
baselines of the profiles. As demonstrated by Fig. 3, at these widths our crite- 
rion based on line-widths and heights alone, becomes somewhat unreliable. There 
are considerable numbers of components with heights larger than allowed by our 
selection criterion, but still most likely not representing the real H I emission. How- 
ever, these Gaussians could easily be recognized from their velocities. These are 
the Gaussians with central velocities lying outside the velocity limits of the pro- 
file, actually used during the Gaussian decomposition. They are the components, 
modeling the cases of rising profile edges and their intensities are estimated from 
the small number of profile channels covering only the minor part of the region, 
where the corresponding Gaussian has considerable intensities. 

In the LDS the region of the widest weak Gaussians is most likely dominated
by the baseline problems, but when dealing with these components their velocity
information must also be considered, as in the regions -390 < Vc < -230 km s⁻¹
and 180 < Vc < 280 km s⁻¹ the selection criterion based on the line-widths and
heights alone will also discard some information about HVCs. At the level of the
present discussion it remains unclear if and how Gaussians represent the high
velocity dispersion halo gas reported by Kalberla et al. (1998). On the basis of
the data presented in their paper, it seems that our selection criteria for the LDS
discard most of this gas as well. At the same time, it seems to be impossible to
decide on the basis of the single profile data whether it contains the emission from
HVDHG or a baseline problem.

It is well known that near the galactic plane the H I 21-cm emission line profiles
are so complicated that it is impossible to derive from the Gaussian analysis reliable 
conclusions on physical properties of the gas concentrations. Nevertheless, the 
comparison of Figs. 1 and 8 demonstrates that at least the selection criteria for 
discarding different artifacts are applicable in both cases without modifications. 
Moreover, despite clear differences in the distributions of the parameter values of 
the Gaussians, representing the H I emission for regions |b| > 30° and |b| < 30°,
there are still some similarities, indicating that at least for some profiles in the
region |b| < 30° not all of the useful information may be completely "washed out"
by velocity crowding, blending and other effects. 

The approach used for the LDS is applied also to the new LAB, where we
may conclude that the baseline estimates for the LDS2 part of the LAB have
been made considerably more uniform, if compared to the LDS (the conspicuous
"chess-board" sky has disappeared), but for the IAR the baseline may still be
somewhat questionable. Also, the problems caused by radio-interferences may be
more severe in the case of the IAR than for the LDS2. Moreover, we point out the
strange behavior of the observational noise in the IAR, where only the estimates
for signal-free regions are in agreement with those published by the authors of the
survey. The behavior of the noise in the regions with line emission corresponds to
the one expected on the basis of the radiometer equation only if we accept that the
rms in these regions is about 15-21% higher than the estimate derived from the
emission-free regions. Only after taking into account such behavior of the noise
estimates, it is possible to obtain for the IAR the decomposition results similar to
those of the LDS or the LDS2.

When comparing the shapes of the distributions of Gaussian parameters for
different versions of the surveys, we may conclude that for most cases the selection
criteria for separation of the components, most likely not related to the emission
of galactic H I, are rather universal. The only exception is the region of the widest
Gaussians, which in the case of the LDS seems to be dominated by the components
caused by the badly defined baseline, but the situation may be different for the
LAB. Therefore, considering also the uncertainties in the determination of the
selection criteria, those based on the LDS and given by Eqs. (2) and (3) are to
some extent applicable also to the LAB. However, mostly due to the differences in
the situation with the widest Gaussians, those indicated in Figs. 10 and 13 with
solid red lines are preferred for the LAB. Corresponding functional expressions
are:

$$\lg(\mathrm{FWHM}) = 0.057\,\lg^2(T_{b0}) - 0.067\,\lg(T_{b0}) + 0.094 \qquad (4)$$

for line 1 and

$$\lg(T_{b0}) = -0.370\,\lg^3(\mathrm{FWHM}) + 1.132\,\lg^2(\mathrm{FWHM}) - 1.567\,\lg(\mathrm{FWHM}) + 0.117 \qquad (5)$$

for line 2 respectively. These criteria are selected on the basis of the full LAB data
(the LDS2 and the IAR combined). Eq. (4) may be applied irrespective of the
velocities of the Gaussians involved, but Eq. (5) seems to be useful mainly for the
relatively slow components in the velocity interval of about |Vc| ≲ 150 km s⁻¹.
At higher velocities the parameters of the Gaussians, describing the HVCs, some
external galaxies and the survey artifacts may be rather similar and it is hard
to decide on the basis of these parameters alone, what may be the actual source
of the corresponding Gaussian. Eq. (4) becomes also important at the extreme
velocities near the profile edges where the probability of spurious components
increases. The Gaussians with the central velocities outside the velocity range,
used for the decomposition, must also be excluded as due to the uncertainties
introduced by the bandpass removal.
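For illustration, the two LAB criteria can be turned into a simple classifier, as in
the Python sketch below. The function names and the returned labels are
hypothetical. The sketch takes the curve for line 1 to be quadratic in lg(Tb0) and
assumes that a Gaussian is flagged when it is narrower than line 1 at its height, or
weaker than line 2 at its width. It applies line 2 only for |Vc| up to 150 km/s, as
suggested above; components with central velocities outside the velocity range used
for the decomposition are not handled here.

```python
import math

def line1_lab(lg_tb0):
    """lg(FWHM) on line 1 at a given lg(Tb0), following Eq. (4)."""
    return 0.057 * lg_tb0 ** 2 - 0.067 * lg_tb0 + 0.094

def line2_lab(lg_fwhm):
    """lg(Tb0) on line 2 at a given lg(FWHM), following Eq. (5)."""
    return (-0.370 * lg_fwhm ** 3 + 1.132 * lg_fwhm ** 2
            - 1.567 * lg_fwhm + 0.117)

def classify(vc, fwhm, tb0, v_limit=150.0):
    """Label a Gaussian with centre vc (km/s), FWHM (km/s) and height tb0 (K)."""
    lg_fwhm, lg_tb0 = math.log10(fwhm), math.log10(tb0)
    if lg_fwhm < line1_lab(lg_tb0):
        return "narrow"      # assumed noise or interference side of line 1
    if lg_tb0 < line2_lab(lg_fwhm):
        # Weak side of line 2: flagged only for relatively slow components.
        return "weak" if abs(vc) <= v_limit else "undecided"
    return "kept"

# A wide, weak component near Vc = -30 km/s with lg(FWHM) = 2.2, Tb0 = 0.05 K.
print(classify(-30.0, 10 ** 2.2, 0.05))
```

As stressed in the text, such labels are statistical in nature and cannot replace the
inspection of individual profiles.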

Finally, we would like to stress that all the discussion in this paper is statistical 
in its nature and the selection criteria presented should not be taken as a final truth
for every particular profile and Gaussian component. As we hope, these criteria 
permit us to detect and reject most of the problematic cases described above and 
in this way to reduce the number of the undesirable components in the database. 
However, there are certainly cases, not detected by these criteria and also cases 
where important astrophysical signatures may be removed. Therefore, in the first 
order these criteria are useful for statistical work on the Milky Way H I. However,
they can also be used as first guiding lines for labeling the problematic profiles 
in the surveys. For example, in the case of multiple observations at the same 
position of the LDS2, the criteria described above have been used to select the 
"best" profile which is expected to be the least problematic in the sense of the 
presence of spurious features discussed in this paper. However, the final decision
on the nature of such features must be made on the basis of the inspection of the
actual profiles and other astronomical observations.